Neural Information Processing Systems

We thank all the reviewers for their time and insightful feedback on our work. Many recent few-shot learning works focus on computer vision rather than NLU tasks; we leverage self-training with several advances to bridge this gap. Similar baselines are reported for active learning [Gal et al., 2017] and preference learning [Houlsby et al.]. UDA [Xie et al., 2019] and self-training with noisy student [Xie et al., 2020] show these techniques to work best with large amounts of unlabeled data. Additionally, for IMDB, longer sequence length plays a big role. Sample mixing based on easy and hard examples is an interesting idea.


Review for NeurIPS paper: Uncertainty-aware Self-training for Few-shot Text Classification

Weaknesses: My main concerns are with the experiments. While the authors make an effort to perform ablation analysis, I think some important ablations are still missing to convince me that such a BNN-powered self-training scheme is better than classic ST: (1) The proposed method always uses a smart sample-selection strategy, while the classic ST baseline in this paper either does not select samples or selects them uniformly. It is very common for classic ST to select samples based on confidence scores, which can be class-dependent as well. Thus the comparison with classic ST does not seem fair. I would like to see a comparison between UST without Conf and classic ST with confidence-based, class-dependent sample selection, or alternatively the full UST with its sample-selection component replaced by confidence-score-based selection. Otherwise, I see no direct evidence that the BNN-powered "uncertainty-awareness" is better than a simple confidence-score-based baseline.
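To make the contrast in the review concrete, here is a minimal sketch of the two selection schemes being compared: classic confidence-based pseudo-label selection (max softmax probability from a single forward pass) versus uncertainty-aware selection via MC dropout (predictive entropy over several stochastic forward passes). All function names, the toy data, and the selection budget are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def confidence_select(probs, k):
    """Classic ST: keep the k unlabeled examples with the highest
    max-softmax confidence from one deterministic forward pass."""
    conf = probs.max(axis=1)
    return np.argsort(-conf)[:k]  # indices sorted by descending confidence

def mc_dropout_select(prob_samples, k):
    """Uncertainty-aware ST (sketch): given T stochastic forward passes
    with dropout kept on, average the predictive distributions and keep
    the k examples with the lowest predictive entropy (least uncertain)."""
    mean_probs = prob_samples.mean(axis=0)  # (N, C) averaged over T passes
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
    return np.argsort(entropy)[:k]  # indices sorted by ascending entropy

# Toy stand-in for model outputs: N=5 unlabeled examples, C=2 classes.
probs = rng.dirichlet([1.0, 1.0], size=5)               # one pass: (5, 2)
prob_samples = rng.dirichlet([1.0, 1.0], size=(10, 5))  # T=10 passes: (10, 5, 2)

chosen_conf = confidence_select(probs, k=2)
chosen_unc = mc_dropout_select(prob_samples, k=2)
```

The reviewer's point is that a fair ablation would swap only this selection function (confidence vs. entropy) while holding the rest of the self-training loop fixed; class-dependent selection would additionally apply a per-class quota or threshold inside either function.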